original story
Mechanistic Interpretability of Socio-Political Frames in Language Models
This paper explores the ability of large language models to generate and recognize deep cognitive frames, particularly in socio-political contexts. We demonstrate that LLMs are highly fluent in generating texts that evoke specific frames and can recognize these frames in zero-shot settings. Inspired by mechanistic interpretability research, we investigate the location of the 'strict father' and 'nurturing parent' frames within the model's hidden representation, identifying singular dimensions that correlate strongly with their presence. Our findings contribute to understanding how LLMs capture and express meaningful human concepts.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Germany > Berlin (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Media (0.93)
- Government (0.68)
Help Me Write a Story: Evaluating LLMs' Ability to Generate Writing Feedback
Rashkin, Hannah, Clark, Elizabeth, Huot, Fantine, Lapata, Mirella
Can LLMs provide support to creative writers by giving meaningful writing feedback? In this paper, we explore the challenges and limitations of model-generated writing feedback by defining a new task, dataset, and evaluation frameworks. To study model performance in a controlled manner, we present a novel test set of 1,300 stories that we corrupted to intentionally introduce writing issues. We study the performance of commonly used LLMs in this task with both automatic and human evaluation metrics. Our analysis shows that current models have strong out-of-the-box behavior in many respects -- providing specific and mostly accurate writing feedback. However, models often fail to identify the biggest writing issue in the story and to correctly decide when to offer critical vs. positive feedback.
- North America > United States (0.93)
- Asia > Middle East > UAE (0.46)
CoRE: Condition-based Reasoning for Identifying Outcome Variance in Complex Events
Vallurupalli, Sai, Ferraro, Francis
Knowing which latent conditions lead to a particular outcome is useful for critically examining claims made about complex event outcomes. Identifying implied conditions and examining their influence on an outcome is challenging. We handle this by combining and augmenting annotations from two existing datasets consisting of goals and states, and explore the influence of conditions through our research questions and Condition-based Reasoning tasks. We examine open and closed LLMs of varying sizes and intent-alignment on our reasoning tasks and find that conditions are useful when not all context is available. Models differ widely in their ability to generate and identify outcome-variant conditions, which affects their performance on outcome validation when conditions are used to replace missing context. Larger models, like GPT-4o, are more cautious in such less constrained situations.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Maryland > Baltimore County (0.14)
- North America > United States > Maryland > Baltimore (0.14)
CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation
Chen, Tong, Asai, Akari, Mireshghallah, Niloofar, Min, Sewon, Grimmelmann, James, Choi, Yejin, Hajishirzi, Hannaneh, Zettlemoyer, Luke, Koh, Pang Wei
Evaluating the degree of reproduction of copyright-protected content by language models (LMs) is of significant interest to the AI and legal communities. Although both literal and non-literal similarities are considered by courts when assessing the degree of reproduction, prior research has focused only on literal similarities. To bridge this gap, we introduce CopyBench, a benchmark designed to measure both literal and non-literal copying in LM generations. Using copyrighted fiction books as text sources, we provide automatic evaluation protocols to assess literal and non-literal copying, balanced against the model utility in terms of the ability to recall facts from the copyrighted works and generate fluent completions. We find that, although literal copying is relatively rare, two types of non-literal copying -- event copying and character copying -- occur even in models as small as 7B parameters. Larger models demonstrate significantly more copying, with literal copying rates increasing from 0.2% to 10.5% and non-literal copying from 2.3% to 6.9% when comparing Llama3-8B and 70B models, respectively. We further evaluate the effectiveness of current strategies for mitigating copying and show that (1) training-time alignment can reduce literal copying but may increase non-literal copying, and (2) current inference-time mitigation methods primarily reduce literal but not non-literal copying.
- Asia > Singapore (0.04)
- North America > United States > Alabama (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- Law > Intellectual Property & Technology Law (1.00)
- Leisure & Entertainment (0.94)
Vital: The Future of Healthcare
Vital: The Future of Healthcare is an anthology of short stories. Vital has already gathered stories from leading futurist writers, weaving together disparate visions of what comes next in health and health science. Our visions of the future -- whether dark or hopeful, thrilling or mundane -- have always challenged us to examine our world. What challenges will we face? Vital: The Future of Healthcare aims to explore these questions as they relate to humanity's physical and mental well-being.
- North America > United States > New York (0.05)
- South America > Bolivia (0.05)
- North America > United States > Illinois > Cook County > Chicago (0.05)
- Health & Medicine > Consumer Health (0.49)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.48)
This text-generation algorithm is supposedly so good it's frightening. Judge for yourself.
The best weapons are secret weapons. Freed from the boundaries of observable reality, they can hold infinite power and thus provoke infinite fear -- or hope. In World War II, as reality turned against them, the Nazis kept telling Germans about the Wunderwaffe about to hit the front lines -- "miracle weapons" that would guarantee victory for the Reich. The Stealth Bomber's stealth was not just about being invisible to radar -- it was also about its capabilities being mysterious to the Soviets. And whatever the Russian "dome of light" weapon is and those Cuban "sonic attacks" are, they're all terrifying.
- Europe > Germany (0.49)
- Asia > Russia (0.29)
- North America > United States > Utah (0.04)
- Media > News (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.95)
Counterfactual Story Reasoning and Generation
Qin, Lianhui, Bosselut, Antoine, Holtzman, Ari, Bhagavatula, Chandra, Clark, Elizabeth, Choi, Yejin
Counterfactual reasoning requires predicting how alternative events, contrary to what actually happened, might have resulted in different outcomes. Despite being considered a necessary component of AI-complete systems, few resources have been developed for evaluating counterfactual reasoning in narratives. In this paper, we propose Counterfactual Story Rewriting: given an original story and an intervening counterfactual event, the task is to minimally revise the story to make it compatible with the given counterfactual event. Solving this task will require a deep understanding of causal narrative chains and counterfactual invariance, and the integration of such story reasoning capabilities into conditional language generation models. We present TimeTravel, a new dataset of 29,849 counterfactual rewritings, each with the original story, a counterfactual event, and a human-generated revision of the original story compatible with the counterfactual event. Additionally, we include 80,115 counterfactual "branches" without a rewritten storyline to support future work on semi- or un-supervised approaches to counterfactual story rewriting. Finally, we evaluate the counterfactual rewriting capacities of several competitive baselines based on pretrained language models, and assess whether common overlap and model-based automatic metrics for text generation correlate well with human scores for counterfactual rewriting.
'Assassin's Creed' is becoming an anime series
Less than a year after the release of the Assassin's Creed film adaptation, Ubisoft is set to revisit the world of its hit game franchise in the form of an anime series. Producer Adi Shankar claims the show will be his next project, after Netflix's Castlevania -- making him the go-to guy for animated video game adaptations. Shankar took to Facebook to make the announcement, adding that Ubisoft approached him to create an "original story." That's all we know about the project thus far. Seeing as Shankar managed to assemble an eye-catching roster of talent for Castlevania (including comic book scribe Warren Ellis and Adventure Time's Kevin Kolde), it will be interesting to see who he calls on this time around.
- Leisure & Entertainment (0.70)
- Information Technology > Services (0.62)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Games > Computer Games (0.76)
Watch A.I. Artificial Intelligence online - Amazon Video
Begun by the legendary and often demanding Stanley Kubrick and finished by the equally legendary Steven Spielberg, A.I. Artificial Intelligence is a futuristic retelling of the classic 1881 Florentine Italian tale Pinocchio by Carlo Collodi. Of course, most modern children and adults are only familiar with the Disney animated rendition, which can be considered a Reader's Digest Condensed version of the original story. Kubrick is often demanding in that if he can't make something happen and appear the way he wants it, he will abandon an idea or even an entire project. For instance, since he could not use the technology available at the time to recreate Stephen King's topiaries come to life in The Shining, he abandoned the idea in favor of a daunting maze. Similarly, he felt he did not have the creative team necessary to replicate a robot boy so real that it is nearly human, not without using a human child actor, but because of E.T. and Jurassic Park, he did feel his friend Spielberg was up to the task. The ambitious result is one of the best science fiction films ever made, one which I believe stands shoulder to shoulder with Kubrick's 2001: A Space Odyssey.
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
You Ain't Paleo if You Don't Eat Bark, Plus the Week's Other Revelations
Editor's note: We're proud to bring NextDraft--the most righteous, most essential newsletter on the web--to WIRED.com. Every Friday you'll get a roundup of the week's most popular must-read stories from around the internet, courtesy of mastermind Dave Pell. Millions of homes now have guests who never leave. With names like Cortana, Siri, and Alexa, these always-listening, all-knowing, just about always female-voiced assistants have become like members of the family. They're selling like crazy, they're interacting with your children (maybe more than you are), and a new suite of artificially intelligent assistant bots is being built into baby monitors so they can have an influence on your kids right from the start. WaPo's Michael S. Rosenwald wonders if anyone is focused on how millions of kids are being shaped by know-it-all voice assistants.
- Health & Medicine > Consumer Health (0.72)
- Health & Medicine > Therapeutic Area > Neurology (0.50)